Mining Top-K Co-Occurrence Items
Abstract
Frequent itemset mining has emerged as a fundamental problem in data mining and plays an important role in many data mining tasks, such as association analysis and classification. In the framework of frequent itemset mining, the results are itemsets that are frequent in the whole database. However, in some applications, such as recommendation systems and social networks, users are more interested in finding the items that co-occur most frequently with some user-specified itemsets (query itemsets) in a database. In this paper, we address the problem by proposing a new mining task named top-k co-occurrence item mining, where k is the desired number of items to be found. Four baseline algorithms are presented first. Then, we introduce a special data structure named Pi-Tree (Prefix itemset Tree) to maintain the information of itemsets. Based on Pi-Tree, we propose two algorithms, namely PT (Pi-Tree-based algorithm) and PT-TA (Pi-Tree-based algorithm with TA pruning), for mining top-k co-occurrence items by incorporating several novel strategies for pruning the search space to achieve high efficiency. The performance of PT and PT-TA was evaluated against the four proposed baseline algorithms on both synthetic and real databases. Extensive experiments show that PT not only outperforms the other algorithms substantially in terms of execution time but also has excellent scalability.
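To make the task definition concrete, the following is a minimal Python sketch of top-k co-occurrence item mining by a single naive scan of the database. It is not one of the paper's four baseline algorithms, and it does not use Pi-Tree, PT, or PT-TA; the function name `top_k_cooccurrence`, the toy database, and the tie-breaking rule (alphabetical order among items with equal counts) are assumptions made purely for illustration.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def top_k_cooccurrence(
    transactions: Iterable[Set[str]], query: Set[str], k: int
) -> List[Tuple[str, int]]:
    """Return the k items that co-occur most often with the query itemset.

    Naive illustration only: scans every transaction once and counts the
    items that appear alongside the full query itemset.
    """
    query_set = set(query)
    counts: Counter = Counter()
    for transaction in transactions:
        items = set(transaction)
        # Only transactions containing every query item contribute.
        if query_set <= items:
            counts.update(items - query_set)
    # Sort by descending count; break ties alphabetically (an assumption).
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:k]


# Hypothetical toy database of four transactions.
db = [
    {"bread", "butter", "milk"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"bread", "butter", "milk", "eggs"},
]

result = top_k_cooccurrence(db, {"bread", "butter"}, k=2)
print(result)
```

With the query itemset {bread, butter}, three of the four transactions qualify, and milk appears in two of them, so it ranks first. A full scan like this costs time proportional to the database size for every query, which is the kind of cost that index structures and pruning strategies aim to reduce.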
Related resources
Ensemble-based Top-k Recommender System Considering Incomplete Data
Recommender systems have been widely used in e-commerce applications. They are a subclass of information filtering systems, used either to predict whether a user will prefer an item (prediction problem) or to identify a set of k items that will be of interest to the user (Top-k recommendation problem). Demanding sufficient ratings to make robust predictions and suggesting qualified recommendations are two si...
Efficient Fuzzy Apriori Association Rule Mining to Find Co-occurance Relationship
Department of Computer Science & Engineering/Shriram College of Engineering & Management [SRCEM] Banmore, Gwalior (MP)/India, 474003. Abstract: Data mining is sorting through data to identify patterns and establish relationships. Association rule mining is a well established method of data mining that identif...
Flexible Mining of Association Rules
The discovery of association rules showing conditions of data co-occurrence has attracted the most attention in data mining. An example of an association rule is the rule “the customer who bought bread and butter also bought milk,” expressed by T(bread; butter)→T(milk). Let I ={x1,x2,...,xm} be a set of (data) items, called the domain; let D be a collection of records (transactions), where each...
Minimizing Spurious Patterns Using Association Rule Mining
Most clustering algorithms extract patterns that are of least interest. Such patterns consist of data items that usually belong to widely different support levels. Data items belonging to different support levels have weak associations between them, thus producing patterns of least interest. The reason behind this problem is that such existing algorithms d...
Algorithms for Association Rule Mining
Association Rule Mining (ARM) is one of the important data mining tasks that has been extensively researched by the data-mining community and has found wide applications in industry. An Association Rule is a pattern that implies co-occurrence of events or items in a database. Knowledge of such relationships in a database can be employed in strategic decision making in both commercial and scientific...
Journal title:
- CoRR
Volume: abs/1512.07806
Publication date: 2015